Search CORE

17 research outputs found

Emergence of Addictive Behaviors in Reinforcement Learning Agents

Author: Behzadan Vahid
Munir Arslan
Yampolskiy Roman V.
Publication venue
Publication date: 13/11/2018
Field of study

This paper presents a novel approach to the technical analysis of wireheading in intelligent agents. Inspired by the natural analogues of wireheading and their prevalent manifestations, we propose the modeling of such phenomenon in Reinforcement Learning (RL) agents as psychological disorders. In a preliminary step towards evaluating this proposal, we study the feasibility and dynamics of emergent addictive policies in Q-learning agents in the tractable environment of the game of Snake. We consider a slightly modified settings for this game, in which the environment provides a "drug" seed alongside the original "healthy" seed for the consumption of the snake. We adopt and extend an RL-based model of natural addiction to Q-learning agents in this settings, and derive sufficient parametric conditions for the emergence of addictive behaviors in such agents. Furthermore, we evaluate our theoretical analysis with three sets of simulation-based experiments. The results demonstrate the feasibility of addictive wireheading in RL agents, and provide promising venues of further research on the psychopathological modeling of complex AI safety problems

arXiv.org e-Print Archive

University of Louisville

Founding The Domain of AI Forensics

Author: Baggili Ibrahim
Behzadan Vahid
Publication venue: Digital Commons @ New Haven
Publication date: 11/12/2019
Field of study

With the widespread integration of AI in everyday and critical technologies, it seems inevitable to witness increasing instances of failure in AI systems. In such cases, there arises a need for technical investigations that produce legally acceptable and scientifically indisputable findings and conclusions on the causes of such failures. Inspired by the domain of cyber forensics, this paper introduces the need for the establishment of AI Forensics as a new discipline under AI safety. Furthermore, we propose a taxonomy of the subfields under this discipline, and present a discussion on the foundational challenges that lay ahead of this new research area

arXiv.org e-Print Archive

Digital Commons @ New Haven